“kaiju_db_refseq_xxxx-xx-xx.tgz”. To classify the short reads in our FASTQ files, run the following commands:

mkdir kaiju_output

kaiju -t kaijudb/nodes.dmp \
    -f kaijudb/kaiju_db_refseq.fmi \
    -i fastq_pure/ERR1823587_pure_R1-80.fastq.gz \
    -j fastq_pure/ERR1823587_pure_R2-80.fastq.gz \
    -o kaiju_output/ERR1823587.out \
    -a greedy \
    -z 4 -v

kaiju -t kaijudb/nodes.dmp \
    -f kaijudb/kaiju_db_refseq.fmi \
    -i fastq_pure/ERR1823601_pure_R1-80.fastq.gz \
    -j fastq_pure/ERR1823601_pure_R2-80.fastq.gz \
    -o kaiju_output/ERR1823601.out \
    -a greedy \
    -z 4 -v

kaiju -t kaijudb/nodes.dmp \
    -f kaijudb/kaiju_db_refseq.fmi \
    -i fastq_pure/ERR1823608_pure_R1-80.fastq.gz \
    -j fastq_pure/ERR1823608_pure_R2-80.fastq.gz \
    -o kaiju_output/ERR1823608.out \
    -a greedy \
    -z 4 -v
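
Because the three commands differ only in the run accession, they can also be generated by a loop. The following is a minimal sketch rather than part of kaiju itself: the function name run_kaiju and the DRY_RUN variable are assumptions introduced here for illustration, while the paths and options are copied from the commands above. By default the script only prints the commands so that they can be checked first; setting DRY_RUN=0 executes them.

```shell
#!/bin/sh
# Sketch: run the same kaiju command for each sample accession.
# run_kaiju and DRY_RUN are illustrative names, not kaiju features.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run_kaiju() {
    kj_sample="$1"
    # Rebuild the positional parameters as the full kaiju command line.
    set -- kaiju \
        -t kaijudb/nodes.dmp \
        -f kaijudb/kaiju_db_refseq.fmi \
        -i "fastq_pure/${kj_sample}_pure_R1-80.fastq.gz" \
        -j "fastq_pure/${kj_sample}_pure_R2-80.fastq.gz" \
        -o "kaiju_output/${kj_sample}.out" \
        -a greedy \
        -z 4 -v
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # show the command without running it
    else
        "$@"           # run kaiju with exactly these arguments
    fi
}

# Create the output directory only when the commands are really executed.
if [ "$DRY_RUN" != "1" ]; then
    mkdir -p kaiju_output
fi

for sample in ERR1823587 ERR1823601 ERR1823608; do
    run_kaiju "$sample"
done
```

Saved under a hypothetical name such as run_kaiju.sh, the script would be previewed with "sh run_kaiju.sh" and executed with "DRY_RUN=0 sh run_kaiju.sh".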

To learn more about these options, run “kaiju” without arguments. The indexing and classification require around 128 GB of RAM. We do not recommend using kaiju unless your machine has that much memory and enough free storage space.
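
Before starting a long run, it may help to compare the machine's resources with the figure quoted above. The following is a rough sketch that assumes a Linux system, where /proc/meminfo reports MemTotal in kB; the function name mem_total_gb and the REQUIRED_GB variable are illustrative. MemTotal is usually slightly below the installed amount, so the threshold should be read as approximate.

```shell
#!/bin/sh
# Sketch: compare total RAM with the ~128 GB figure mentioned in the text.
# Assumes Linux (/proc/meminfo); names below are illustrative only.

REQUIRED_GB=128

mem_total_gb() {
    # $1: path to a meminfo-style file (defaults to /proc/meminfo).
    # MemTotal is given in kB; convert to whole GB (truncated).
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    total_gb="$(mem_total_gb)"
    if [ "$total_gb" -lt "$REQUIRED_GB" ]; then
        echo "Only ${total_gb} GB of RAM detected; kaiju may run out of memory." >&2
    else
        echo "${total_gb} GB of RAM detected."
    fi
    # Show free storage space in the current working directory.
    df -h .
fi
```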

After the program has run successfully, convert each kaiju output file into a summary table using the “kaiju2table” command as follows:

kaiju2table -t kaijudb/nodes.dmp \
    -n kaijudb/names.dmp \
    -r taxonomic_level \
    -o kaiju_output/ERR1823587_table.tsv \
    kaiju_output/ERR1823587.out \
    -l taxonomic,levels,separated,by,commas

kaiju2table -t kaijudb/nodes.dmp \
    -n kaijudb/names.dmp \
    -r taxonomic_level \
    -o kaiju_output/ERR1823601_table.tsv \
    kaiju_output/ERR1823601.out \
    -l taxonomic,levels,separated,by,commas

kaiju2table -t kaijudb/nodes.dmp \
    -n kaijudb/names.dmp \